The TÜBİTAK-UEKAE statistical machine translation system for IWSLT 2007
Abstract
We describe the TÜBİTAK-UEKAE system that participated in the Arabic-to-English and Japanese-to-English translation tasks of the IWSLT 2007 evaluation campaign. Our system is built on the open-source phrase-based statistical machine translation software Moses. Of the available corpora and linguistic resources, the system uses only the supplied training data and an Arabic morphological analyzer. We present a run-time lexical approximation method to cope with out-of-vocabulary words during decoding. We tested our system under both automatic speech recognition (ASR) and clean transcript (clean) input conditions. Our system ranked first in both the Arabic-to-English and Japanese-to-English tasks under the “clean” condition.
Similar resources
The TÜBİTAK-UEKAE statistical machine translation system for IWSLT 2008
In this study, the TÜBİTAK-UEKAE statistical machine translation system based on the open-source phrase-based statistical machine translation software, Moses, is presented. Additionally, phrase-table augmentation is applied to maximize source language coverage; lexical approximation is applied to replace out-of-vocabulary words with known words prior to decoding; and automatic punctuation insert...
The TÜBİTAK-UEKAE statistical machine translation system for IWSLT 2009
We describe our Arabic-to-English and Turkish-to-English machine translation systems that participated in the IWSLT 2009 evaluation campaign. Both systems are based on the Moses statistical machine translation toolkit, with added components to address the rich morphology of the source languages. Three different morphological approaches are investigated for Turkish. Our primary submission uses l...
The TÜBİTAK-UEKAE statistical machine translation system for IWSLT 2010
We report on our participation in the IWSLT 2010 evaluation campaign. Similar to previous years, our submitted systems are based on the Moses statistical machine translation toolkit. This year, we also experimented with hierarchical phrase-based models. In addition, we utilized automatic minimum error-rate training instead of manually guided tuning. We focused more on the BTEC Turkish-English ta...
The TÜBİTAK statistical machine translation system for IWSLT 2012
We describe the TÜBİTAK submission to the IWSLT 2012 Evaluation Campaign. Our system development focused on utilizing Bayesian alignment methods such as variational Bayes and Gibbs sampling in addition to the standard GIZA++ alignments. The submitted tracks are the Arabic-English and Turkish-English TED Talks translation tasks.
The CASIA phrase-based statistical machine translation system for IWSLT 2007
This paper describes our phrase-based statistical machine translation system (CASIA) used in the evaluation campaign of the International Workshop on Spoken Language Translation (IWSLT) 2007. In this year's evaluation, we participated in the open data track for clean text in the Chinese-to-English machine translation task. Here, we mainly introduce the overview of the system, the primary modules, th...